Learning Features from Co-occurrences: A Theoretical Analysis
Abstract
Representing a word by its co-occurrences with other words in context is an effective way to capture the meaning of the word. However, the theory behind this approach remains a challenge. In this work, taking a word classification task as an example, we give a theoretical analysis of approaches that represent a word X by a function f(P(C|X)), where C is a context feature, P(C|X) is the conditional probability estimated from a text corpus, and the function f maps the co-occurrence measure to a prediction score. We investigate the impact of the context feature C and the function f. We also explain why using co-occurrences with multiple context features may be better than using a single one. In addition, some of the results shed light on the theory of feature learning and machine learning in general.
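To make the notation f(P(C|X)) concrete, here is a minimal Python sketch. It is not the paper's method: the toy corpus, the window size, the choice of a single word as the context feature C, and the step function used for f are all assumptions made for illustration.

```python
from collections import Counter


def conditional_cooccurrence(sentences, context_word, window=2):
    """Estimate P(C|X) for every word X in the corpus.

    C is taken to be the event "context_word appears within `window`
    tokens of X" (an illustrative choice of context feature).
    """
    occurrences = Counter()   # how often each word X appears
    with_context = Counter()  # how often X appears near context_word
    for tokens in sentences:
        for i, word in enumerate(tokens):
            occurrences[word] += 1
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            neighbours = tokens[lo:i] + tokens[i + 1:hi]
            if context_word in neighbours:
                with_context[word] += 1
    return {w: with_context[w] / occurrences[w] for w in occurrences}


def score(p, threshold=0.5):
    """Hypothetical f: map the co-occurrence measure to a prediction score."""
    return 1.0 if p >= threshold else 0.0


# Toy corpus (assumed): does the word tend to appear near "drinks"?
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the car needs fuel".split(),
]
p_drinks = conditional_cooccurrence(corpus, "drinks")
print(p_drinks["cat"], score(p_drinks["cat"]))  # 1.0 1.0
print(p_drinks["car"], score(p_drinks["car"]))  # 0.0 0.0
```

In this sketch, words that co-occur with "drinks" receive a high score and others a low one, which loosely mirrors a word classification task driven by a single context feature. Combining several context features would amount to computing one such probability per feature and scoring them jointly.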
Related resources
Using $k$-way Co-occurrences for Learning Word Embeddings
Co-occurrences between two words provide useful insights into the semantics of those words. Consequently, numerous prior works on word embedding learning have used co-occurrences between two words as the training signal for learning word embeddings. However, in natural language texts it is common for multiple words to be related and co-occurring in the same context. We extend the notion of co-oc...
The Intellectual Structure of Knowledge in the Field of Distance Education Using the Co-Word analyses
Background: Co-word analysis is one of the content analysis methods used in scientometric studies and mapping the scientific structure of various fields. The purpose of the present research is to map the structure of distance education using the co-word analysis. Methods: The research method is content analysis using co-word analysis. The research population are 31607 documents indexed in the...
Higher-order Co-occurrence Features based on Discriminative Co-clusters for Image Classification
Co-occurrence based image features have attracted keen attentions due to the promising performances for image classification tasks [1, 2, 3, 6, 7]. For extracting the co-occurrences, it is common to transform the quantitative data into qualitative data (symbols) by means of quantization (clustering) at first; e.g., continuous gradient orientation is coded into orientation bins [3], RGB colors a...
Unsupervised statistical learning of higher-order spatial structures from visual scenes.
Three experiments investigated the ability of human observers to extract the joint and conditional probabilities of shape co-occurrences during passive viewing of complex visual scenes. Results indicated that statistical learning of shape conjunctions was both rapid and automatic, as subjects were not instructed to attend to any particular features of the displays. Moreover, in addition to singl...
Multi-Object Classification and Unsupervised Scene Understanding Using Deep Learning Features and Latent Tree Probabilistic Models
Deep learning has shown state-of-art classification performance on datasets such as ImageNet, which contain a single object in each image. However, multi-object classification is far more challenging. We present a unified framework which leverages the strengths of multiple machine learning methods, viz deep learning, probabilistic models and kernel methods to obtain state-of-art performance on ...
Journal: CoRR
Volume: abs/1707.04218
Publication date: 2017